Search results for "latent Dirichlet allocation"
showing 10 items of 10 documents
The Use of Artificial Intelligence in Disaster Management - A Systematic Literature Review
2019
Whenever a disaster occurs, users in social media, sensors, cameras, satellites, and the like generate vast amounts of data. Emergency responders and victims use this data for situational awareness, decision-making, and safe evacuations. However, making sense of the generated information under time-bound situations is a challenging task as the amount of data can be significant, and there is a need for intelligent systems to analyze, process, and visualize it. With recent advancements in Artificial Intelligence (AI), numerous researchers have begun exploring AI, machine learning (ML), and deep learning (DL) techniques for big data analytics in managing disasters efficiently. This paper adopt…
From user-generated data to data-driven innovation: A research agenda to understand user privacy in digital markets
2021
In recent years, strategies focused on data-driven innovation (DDI) have led to the emergence and development of new products and business models in the digital market. However, these advances have given rise to the development of sophisticated strategies for data management, predicting user behavior, or analyzing their actions. Accordingly, the large-scale analysis of user-generated data (UGD) has led to the emergence of user privacy concerns about how companies manage user data. Although there are some studies on data security, privacy protection, and data-driven strategies, a systematic review on the subject that would focus on both UGD and DDI as main concepts is lacking. There…
The Impact of COVID-19 on Sport in Twitter: A Quantitative and Qualitative Content Analysis
2021
The spread of the SARS-CoV-2 virus has transformed many aspects of people’s daily life, including sports. Social networks have been flooded on these issues. The present study aims to analyze the tweets produced relating to sports and COVID-19. From the end of January to the beginning of May 2020, over 4,000,000 tweets on this subject were downloaded through the Twitter search API. Once the duplicates, replicas, and retweets were removed, 119,253 original tweets were analyzed. A quantitative–qualitative content analysis was used to study the selected tweets. Posts dynamics regarding sport and exercise evolved according to the COVID-19 pandemic and subsequent lockdown, shifting from consideri…
Object Matching in Distributed Video Surveillance Systems by LDA-Based Appearance Descriptors
2009
Establishing correspondences among object instances is still challenging in multi-camera surveillance systems, especially when the cameras’ fields of view are non-overlapping. Spatiotemporal constraints can help in solving the correspondence problem but still leave a wide margin of uncertainty. One way to reduce this uncertainty is to use appearance information about the moving objects in the site. In this paper we present the preliminary results of a new method that can capture salient appearance characteristics at each camera node in the network. A Latent Dirichlet Allocation (LDA) model is created and maintained at each node in the camera network. Each object is encoded in terms of the…
Entropy-based Localization of Textured Regions
2011
Appearance description is a relevant field in computer vision that enables object recognition in domains such as re-identification, retrieval, and classification. Important cues to describe appearance are colors and textures. However, in real cases, texture detection is challenging due to occlusions and to deformations of the clothing as the person's pose changes. Moreover, in some cases, the processed images have a low resolution and state-of-the-art methods for texture analysis are not appropriate. In this paper, we deal with the problem of localizing real textures for clothing description purposes, such as stripes and/or complex patterns. Our method uses the entropy of primitive distribu…
Multi-label Classification Using Stacked Hierarchical Dirichlet Processes with Reduced Sampling Complexity
2018
Nonparametric topic models based on hierarchical Dirichlet processes (HDPs) allow for the number of topics to be automatically discovered from the data. The computational complexity of standard Gibbs sampling techniques for model training is linear in the number of topics. Recently, it was reduced to be linear in the number of topics per word using a technique called alias sampling combined with Metropolis Hastings (MH) sampling. We propose a different proposal distribution for the MH step based on the observation that distributions on the upper hierarchy level change slower than the document-specific distributions at the lower level. This reduces the sampling complexity, making it linear i…
Online Sparse Collapsed Hybrid Variational-Gibbs Algorithm for Hierarchical Dirichlet Process Topic Models
2017
Topic models for text analysis are most commonly trained using either Gibbs sampling or variational Bayes. Recently, hybrid variational-Gibbs algorithms have been found to combine the best of both worlds. Variational algorithms are fast to converge and more efficient for inference on new documents. Gibbs sampling enables sparse updates since each token is only associated with one topic instead of a distribution over all topics. Additionally, Gibbs sampling is unbiased. Although Gibbs sampling takes longer to converge, it is guaranteed to arrive at the true posterior after infinitely many iterations. By combining the two methods it is possible to reduce the bias of variational methods while …
Using Topic Modeling Methods for Short-Text Data: A Comparative Analysis
2020
With the growth of online social network platforms and applications, large amounts of textual user-generated content are created daily in the form of comments, reviews, and short-text messages. As a result, users often find it challenging to discover useful information or more on the topic being discussed from such content. Machine learning and natural language processing algorithms are used to analyze the massive amount of textual social media data available online, including topic modeling techniques that have gained popularity in recent years. This paper investigates the topic modeling subject and its common application areas, methods, and tools. Also, we examine and compare five frequen…
A two-stage LDA algorithm for ranking induced topic readability
2022
Probabilistic topic models, such as LDA, are standard text analysis algorithms that provide predictive and latent topic representation for a corpus. However, due to the unsupervised training process, it is difficult to verify the assumption that the latent space discovered by these models is generally meaningful and valuable. This paper introduces a two-stage LDA algorithm to estimate latent topics in text documents and use readability scores to link the identified topics to a linguistically motivated latent structure. We define a new interpretative tool called induced topic readability, which is used to rank topics from the one with the most complex linguistic structure to the one with the…
Organizational identity and competition : a study of US semiconductor industry
2017
Organizational identity and competition: a study of the semiconductor industry. The aim of the study is to examine how two companies competing in the same semiconductor sector, Intel Corporation and Advanced Micro Devices, describe themselves and their field in relation to their competitors, and thereby construct and maintain their organizational identity. The theoretical perspective of the study is to view markets as a socially constructed field that is maintained by competition. The method of analysis is a machine-learning text modeling technique called Latent Dirichlet Allocation (LDA), with which …